Abstract We describe a flagging and calibration pipeline intended for making
quick look images from GMRT data. The package identifies and flags corrupted
visibilities, computes calibration solutions and interpolates these onto the
target source. These flagged, calibrated visibilities can be directly imaged
using any standard imaging package. The pipeline is written in "C" with the
most compute intensive algorithms being parallelized using OpenMP.

1 Introduction

Radio interferometric data taken at the GMRT have traditionally been analyzed
interactively using the AIPS data package (but see also Sirothia (2009)). This
becomes cumbersome when large data sets need to be analyzed, or when the data
analysis has to be done in quasi real time. We describe here a "C" based
program which calibrates GMRT data as well as flags data affected by
interference or by instrumental problems. This package was developed largely
in the context of two ongoing programs at the GMRT, viz. (a) a search for
transient radio sources, for which quasi real time data analysis is needed,
and (b) a search for HI emission at high redshifts, for which large volumes of
data have to be analyzed. The package is designed to give a quick look at the
data for the first program (in order to determine whether a given burst of
radio emission can be localized in the sky or not) and to do the first pass of
flagging and calibration for the second program. However, since it operates on
files in a standard format (viz. FITS) and is relatively flexible, it is
expected to be useful for a larger range of problems than it was specifically
designed for.

Radio interferometers measure "visibilities", i.e. the Fourier transform of
the sky brightness distribution. Descriptions of the processes required to
convert this information into an image of the sky can be found in Thompson et
al. (2001); Taylor et al. (1999); Chengalur et al. (2003). Here we focus only
on the initial stages of this process, viz. that of identifying and flagging
corrupted data, and subsequently determining and correcting for the antenna
based complex gains.

Data at the GMRT could be corrupted for a variety of reasons, e.g. instrumental
failure, radio frequency interference (RFI), ionospheric scintillations etc. A
wide range of interfering signals could be present, including interference that
is narrow band and persistent (e.g. from local digital equipment), interference
that is broad band but bursty in time (e.g. from satellites, aircraft radar
etc.) and low level broad band interference that is persistent in time (e.g.
from power transmission lines). A diverse range of approaches has been taken
for identifying interfering signals in radio astronomical data (see e.g.
Chengalur (1996); Kanekar & Chengalur (1997); Urvashi (2003); Briggs & Kocz
(2005); Kocz et al. (2010); Offringa et al. (2010); Paciga et al. (2011)). Here
we identify corrupted data by assuming that the true visibilities should be
continuous in the time-frequency plane. We identify discontinuities using
robust estimators of the underlying statistics of the visibilities. This
approach is well suited
to finding RFI that is localised in time or frequency, but not low level persis- 
tent RFI. We also identify corrupted calibrator visibilities by requiring that 
the visibility phase for calibrators be stable over time. We note that if the 
calibrator is resolved at some baselines or if there is confusing structure in the 
field of view, the requirement that the phase be stable will not be satisfied. 
However, we find in practice that at the GMRT operating frequencies and 
the generally used calibrators, the visibility phase is stable enough for this 
algorithm to work. Often data corruption affects entire subsets of data. For 
example, there may be narrow line RFI in one channel, or a few contiguous 
channels. Similarly, a particular baseline may have corrupted data because of 
a correlator problem. We identify such subsets of corrupted data by making a 
second pass through the data. Channels, baselines or antennas for which more 
than a specified fraction of the data has been identified as corrupted in the 
first pass are identified and then completely flagged out. We note that general 
purpose packages like AIPS and CASA also have tools that flag data based 
on the assumption that the true visibilities should be continuous in the
time-frequency plane. While the algorithms that are used there are similar in
spirit to those used here, flagcal consistently uses robust estimators and the
most compute intensive calculations are parallelized. It is also worth noting
that a wrong choice of parameters in algorithms that are based on smoothness
in the time-frequency plane could lead to over-flagging of the data.

Fig. 1 Flow chart for the flagcal pipeline. See the text for more details.

Calculation of the robust statistics as well as of the calibration solutions is
computationally intensive. Fortunately both calculations can be trivially
parallelized over time or frequency. Hence, wherever possible, these
calculations are computed in parallel using OpenMP pragmas. This gives
substantial speed ups when running the pipeline on the now standard multi-core
workstations.
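
The parallelization pattern described above can be illustrated with a minimal sketch. It is not taken from the flagcal source: the function names (`clip_channel`, `clip_all_channels`) and the data layout (channels stored contiguously) are assumptions made for the example, and simple amplitude clipping stands in for the per-channel work. Because each channel is processed independently, a single OpenMP pragma on the channel loop is sufficient; if the code is compiled without OpenMP support the pragma is ignored and the loop runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Flag every amplitude above thrs_clip in one channel; returns the number flagged. */
static long clip_channel(const double *amp, int *flag, size_t n, double thrs_clip)
{
    long nflagged = 0;
    for (size_t i = 0; i < n; i++) {
        if (amp[i] > thrs_clip) {
            flag[i] = 1;
            nflagged++;
        }
    }
    return nflagged;
}

/* amp and flag hold nchan contiguous channels of nper samples each (assumed layout).
   Channels are independent, so the loop over them can be shared between threads. */
static long clip_all_channels(const double *amp, int *flag,
                              size_t nchan, size_t nper, double thrs_clip)
{
    assert(amp != NULL && flag != NULL);
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (long c = 0; c < (long)nchan; c++)
        total += clip_channel(amp + (size_t)c * nper, flag + (size_t)c * nper,
                              nper, thrs_clip);
    return total;
}
```

The reduction clause is used so that each thread accumulates its own count, which avoids a race on the shared total.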

Below we give some details on the different algorithms used in the pro- 
gram as well as present sample results and images made using data processed 
through the pipeline. 



2 Details of the Pipeline 

The flowchart for the flagcal pipeline is shown in Fig. 1. The main stages
of the pipeline are detailed below. The program consists of a number of
routines, some of which take several user defined parameters (see Table 2). These
parameters can be specified in a plain text configuration file which is read by 
the program at run time. The configuration file also allows the user to specify 
which of the routines are to be run on a given data set, allowing a great deal 
of flexibility while executing the program. 



2.1 Indexing and Pre Calibration Flagging 



GMRT data is available as a random group UVFITS file. Accordingly, the flagcal
program takes a random group UVFITS file as input and its output is a
calibrated and flagged random group UVFITS file. The first stage of the
pipeline reads the input random group FITS file and creates an index table for
the file. The index table carries information on the start and stop record, as
well as the source ID and source type (i.e. flux calibrator, phase calibrator
or target source) for each scan. By a scan we mean a continuous observation of
a given source, with fixed receiver and backend settings. A GMRT data file
typically consists of a series of scans on different sources (e.g. flux
calibrator, phase calibrator, target source). The visibility data themselves
are stored internally in the form of a row-major multi dimensional array. The
size of a typical GMRT data file is less than 4 GB. The current algorithm hence
keeps all of the indexed data in memory. However, for large files (greater
than 8 GB) the algorithms can be easily modified to process the data scan by
scan. This would keep the memory requirements modest while also minimising the
amount of disk I/O. Each element of the array is a complex3 structure (i.e.
consisting of real, imaginary and weight fields). Data with negative weight
are regarded as being flagged. A typical visibility data point for baseline
$b$, Stokes parameter $s$, frequency channel $c$ and record $r$ is indexed as
$V[b + n_b(s + n_s(c + n_c \times r))]$, or $V_r(c, s, b)$, where $n_b$, $n_s$
and $n_c$ are the total number of baselines, Stokes parameters and channels
respectively.
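
The indexing rule above can be written down directly. The following is a minimal illustration in C rather than the flagcal source: the type and function names (`VisShape`, `Complex3`, `vis_index`, `vis_is_flagged`) and the use of single precision fields are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Dimensions of the visibility array (illustrative names). */
typedef struct {
    size_t nb;  /* number of baselines */
    size_t ns;  /* number of Stokes parameters */
    size_t nc;  /* number of frequency channels */
    size_t nr;  /* number of records (time samples) */
} VisShape;

/* One visibility: real part, imaginary part and weight. */
typedef struct {
    float re, im, wt;
} Complex3;

/* Flat offset of V_r(c, s, b): baseline varies fastest, record slowest. */
static size_t vis_index(const VisShape *sh, size_t b, size_t s, size_t c, size_t r)
{
    assert(b < sh->nb && s < sh->ns && c < sh->nc && r < sh->nr);
    return b + sh->nb * (s + sh->ns * (c + sh->nc * r));
}

/* Data with negative weight are regarded as flagged. */
static int vis_is_flagged(const Complex3 *v)
{
    return v->wt < 0.0f;
}
```

With this layout all baselines of one (record, channel, Stokes) combination are contiguous in memory, while the time series of a single baseline is strided.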

The next stage of the pipeline does pre-calibration flagging. Several types of
flagging are possible. The first is a simple thresholding or clipping, where
all visibilities whose amplitudes lie above a user defined threshold are
clipped. If the user has some prior information on corrupted baselines,
antennas or channels, this can be passed to the pipeline in the form of a flag
file. Clipping and initial flagging (on the basis of the input flag file) is
followed by two stages of MAD filtering^1. A MAD filter flags all visibilities
whose amplitudes differ from the median amplitude by more than a user defined
threshold times the median absolute deviation (MAD). This filter has been
chosen because of its robustness (i.e. the median and the median absolute
deviation are robust to the presence of outliers, unlike the mean and the
standard deviation). The two MAD filter steps before calibration are a Global
Mad Filter and a Pre Mad Filter. The Global Mad Filter flags the data points
for which the visibility amplitude is discrepant as compared to the global
median and MAD. These statistical parameters are computed over all baselines,
channels and Stokes parameters present in the data file for that given source.
For constructing the Global Mad Filter, the visibility data array $A_i$, where
$i$ varies from 0 to $n_r \times n_b \times n_c \times n_s$, is split into
$n_r$ sub-arrays, one for each time sample. The median $m_1$ and MAD $m_2$ are
computed for each sub-array^2. For each source we collect the different $m_1$
and $m_2$ to construct the global median $M_1^k$ and MAD $M_2^k$ for that
source. All visibilities for source $k$ which satisfy

$$\mathrm{abs}(|V_r(c, s, b)| - M_1^k) > \mathtt{thrs\_gmad} \times M_2^k \tag{1}$$

are flagged, where thrs_gmad is a user defined threshold (see Table 2). Since
the Global Mad Filter takes into account all the data points available for a
source, it is expected to be very robust.

^1 The median absolute deviation or MAD, which we refer to as $M_2$ in this
paper, is defined as the median of the absolute deviations around the median
value.

^2 For an array $\{x\} = (x_1, x_2, x_3, \ldots)$,
$M_1 = \mathrm{median}(x_1, x_2, x_3, \ldots)$ and
$M_2 = \mathrm{median}(|x_1 - M_1|, |x_2 - M_1|, |x_3 - M_1|, \ldots)$.
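
The basic MAD filter can be sketched as follows. This is a simplified illustration under stated assumptions, not the flagcal implementation: it computes the median and MAD of a single array of amplitudes directly, and does not reproduce the two-level scheme in which per-record statistics are combined into a global value. The function names (`median`, `mad`, `mad_filter`) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n values; works on a private copy so the input is left untouched. */
static double median(const double *x, size_t n)
{
    assert(n > 0);
    double *tmp = malloc(n * sizeof *tmp);
    assert(tmp != NULL);
    memcpy(tmp, x, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_double);
    double m = (n % 2) ? tmp[n / 2] : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
    free(tmp);
    return m;
}

/* Median absolute deviation of x about the median m1. */
static double mad(const double *x, size_t n, double m1)
{
    double *dev = malloc(n * sizeof *dev);
    assert(dev != NULL);
    for (size_t i = 0; i < n; i++)
        dev[i] = x[i] > m1 ? x[i] - m1 : m1 - x[i];
    double m2 = median(dev, n);
    free(dev);
    return m2;
}

/* Set flag[i] = 1 where |amp[i] - M1| > thrs * M2, else 0; returns the number flagged. */
static size_t mad_filter(const double *amp, int *flag, size_t n, double thrs)
{
    double m1 = median(amp, n);
    double m2 = mad(amp, n, m1);
    size_t nflagged = 0;
    for (size_t i = 0; i < n; i++) {
        double d = amp[i] > m1 ? amp[i] - m1 : m1 - amp[i];
        flag[i] = d > thrs * m2;
        nflagged += (size_t)flag[i];
    }
    return nflagged;
}
```

A single strong outlier barely moves either statistic, which is the robustness property that motivates the choice of this filter.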

In contrast to global mad filtering, pre mad filtering is done only on the
basis of the visibility data of the specific channel, Stokes parameter and
scan. This filtering does not combine data from different channels, baselines
or Stokes parameters, i.e. it works only in the time domain. In the Pre Mad
Filter all visibilities that satisfy the following condition are flagged:

$$\mathrm{abs}(|V_r(c, s, b)| - M_1^k(c, s, b)) > \mathtt{thrs\_pmad} \times M_2^k(c, s, b) \tag{2}$$

where thrs_pmad is a user defined threshold value and where $M_1^k(c, s, b)$
and $M_2^k(c, s, b)$ are the median and MAD values of the visibilities, which
are computed separately for every source $k$, channel $c$, baseline $b$ and
Stokes parameter $s$.



2.1.1 VSR and ABC flagging 



For observations of an unresolved point source calibrator, one would expect
that, in the absence of RFI or instrumental problems, the phase of the
visibility for any given Stokes parameter, baseline and channel would vary
slowly with time. Consequently, for sufficiently small stretches of time
(typically 1-2 min at the GMRT) one would expect that the ratio of the
amplitude of the vector sum of the visibilities to the scalar sum of the
visibilities (the "vector to scalar ratio", VSR) would be close to unity. In
this stage of the pipeline, the calibrator visibilities are broken up into
blocks (of size nflag decided by the user) and for each block the ratio
$\mathcal{R}^j(c, s, b)$ defined below is computed.

$$\mathcal{R}^j(c, s, b) = \frac{\left| \sum_{r = r^j_{start}}^{r^j_{end}} V_r(c, s, b)\, w_r(c, s, b) \right|}{\sum_{r = r^j_{start}}^{r^j_{end}} \left| V_r(c, s, b)\, w_r(c, s, b) \right|} \tag{3}$$

where $V_r(c, s, b)$ and $w_r(c, s, b)$ are the complex visibility and weight
respectively, and the flagging block $j$ starts at record $r^j_{start}$ and
ends at record $r^j_{end}$. A flagging block is considered good only if it has
a VSR $\mathcal{R}^j(c, s, b)$ that is greater than a user defined threshold
thrs_vsr. Clearly, the expected value of $\mathcal{R}^j(c, s, b)$ is unity for
the case when the phase of the visibility remains constant and is zero when
the phase varies randomly between zero and $2\pi$. We note that this test is
applied only on the calibrator scans, and not on the target source scans.
Fig. 2 Plots of the ratio of the modulus of the vector sum to the scalar sum
(VSR, see Eqn. 3) of the visibilities for the calibrator scans. For good data
the VSR is expected to be 1.0. All data points for which the VSR is smaller
than a user given threshold (0.95 in the above case) are flagged and are shown
by a cross (x) symbol. Data points which are not flagged are shown by a plus
(+) symbol.

In Fig. 2 is shown the VSR for a typical frequency channel for a few baselines.
Data blocks for which the VSR is less than a user settable threshold (0.95 in
this case) are flagged. The default value of nflag is 2 min. This value was
iteratively determined after running the program on several different GMRT
data sets.

RFI or instrumental problems often affect a finite subset of the data; for
example, a particular channel may be bad, or a particular baseline or antenna.
In order to identify such subsets of corrupted data, the VSR
$\mathcal{R}(c, s, b)$ is marginalized over two of its indices to obtain a set
of normalized "measures of the badness" $A$, $B$ and $C$ for all the antennas,
baselines and channels respectively. Specifically, for each calibrator scan
$i$ we compute:

$$A^i(a) = \frac{1}{N_A} \sum_{c=1}^{n_c} \sum_{s=1}^{n_s} \sum_{b=1}^{n_b} \mathcal{T}^i(c, s, b) \left[ \delta(p - a) + \delta(q - a) \right] \tag{4}$$

$$B^i(b) = \frac{1}{N_B} \sum_{c=1}^{n_c} \sum_{s=1}^{n_s} \mathcal{T}^i(c, s, b) \tag{5}$$

$$C^i(c) = \frac{1}{N_C} \sum_{b=1}^{n_b} \sum_{s=1}^{n_s} \mathcal{T}^i(c, s, b) \tag{6}$$

where the function $\delta$ is unity when its argument is zero and is zero
otherwise, and $p$, $q$ are the indices of the antennas that comprise the
baseline $b$. We normalise $A$, $B$ and $C$ with
$N_A = n_c \times n_s \times (n_a - 1)$, $N_B = n_c \times n_s$ and
$N_C = n_b \times n_s$ respectively, so that their values are constrained to
lie between zero and one, i.e. they represent the fraction of data that has
been identified as corrupted in the earlier pass through the data.
$\mathcal{T}^i(c, s, b)$ takes values of either 0 or 1. It is set to 0 only if
one of the two conditions below is satisfied.

1. The number of "good blocks" (i.e. blocks for which the value of
   $\mathcal{R}$ is greater than the user defined threshold thrs_vsr) is
   larger than a user defined threshold value thrs_ngood. Note that
   thrs_ngood is in units of the flagging block length nflag. Multiplying
   thrs_ngood by the time duration of the flagging block would give the
   corresponding time interval.

2. The length of the longest contiguous time interval in the scan for which
   all the flagging blocks are "good" (i.e. as defined above) is greater than a
   user defined threshold value thrs_block. As above, this is in units of the
   flagging block length.

The idea behind the second test is that sometimes the problem causing 
the data to be corrupted (e.g. hardware failure) occurs partway through the 
scan. In that case there would be a contiguous stretch of time for which the 
data quality is good. This can be distinguished from an intermittent prob- 
lem because if the problem is intermittent the probability that nonetheless 
thrs_block contiguous blocks remain unflagged is small. Note that we carry 
out the test (2) on a scan only when the test (1) fails. 
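
The two tests can be sketched for a single (channel, Stokes, baseline) combination of one scan as below. The function name `scan_is_bad` is hypothetical. Following the text, the sketch assumes that thrs_ngood and thrs_block are given as numbers of flagging blocks; if the configured values are instead fractions of the scan length (as the typical values in Table 2 might suggest), they would first need to be multiplied by the number of blocks in the scan.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the value of T for one scan: 0 if the data are usable, 1 if bad.
   vsr[j] is the VSR of flagging block j; thrs_ngood and thrs_block are
   assumed here to be expressed in numbers of flagging blocks. */
static int scan_is_bad(const double *vsr, size_t nblock, double thrs_vsr,
                       double thrs_ngood, double thrs_block)
{
    assert(vsr != NULL && nblock > 0);
    size_t ngood = 0;    /* total number of good blocks */
    size_t run = 0;      /* length of the current run of good blocks */
    size_t longest = 0;  /* longest run of contiguous good blocks */
    for (size_t j = 0; j < nblock; j++) {
        if (vsr[j] > thrs_vsr) {
            ngood++;
            if (++run > longest)
                longest = run;
        } else {
            run = 0;
        }
    }
    if ((double)ngood > thrs_ngood)
        return 0;                      /* test 1: enough good blocks overall */
    if ((double)longest > thrs_block)
        return 0;                      /* test 2: long enough contiguous good stretch */
    return 1;
}
```

Test 2 is evaluated only when test 1 fails, which matches the order described above.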

As mentioned above, we apply two passes of flagging for the calibrator
scans, the first one based on $\mathcal{R}$ (where individual flagging blocks
are flagged) and the second based on $A$, $B$ and $C$ (where entire antennas,
baselines or channels are flagged). Antennas for which more than thrs_ant of
the data is already flagged, baselines for which more than thrs_base of the
data is already flagged and channels for which more than thrs_chan of the data
is already flagged are flagged in this second pass. Note that these thresholds
represent fractions of the total data for that antenna, baseline or channel.
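
A minimal sketch of how the badness measures could be accumulated from the 0/1 array of per-(channel, Stokes, baseline) results is given below. The function name `badness`, the array layout and the baseline-to-antenna lookup tables `ant1` and `ant2` are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* T is indexed as T[b + nb * (s + ns * c)], with entries 0 (good) or 1 (bad).
   ant1[b] and ant2[b] are the two antennas (p, q) that form baseline b.
   On return A[a], B[b] and C[c] hold the fraction of bad data for each
   antenna, baseline and channel respectively. */
static void badness(const int *T, const size_t *ant1, const size_t *ant2,
                    size_t na, size_t nb, size_t ns, size_t nc,
                    double *A, double *B, double *C)
{
    assert(na > 1 && nb > 0 && ns > 0 && nc > 0);
    for (size_t a = 0; a < na; a++) A[a] = 0.0;
    for (size_t b = 0; b < nb; b++) B[b] = 0.0;
    for (size_t c = 0; c < nc; c++) C[c] = 0.0;

    for (size_t c = 0; c < nc; c++)
        for (size_t s = 0; s < ns; s++)
            for (size_t b = 0; b < nb; b++) {
                int t = T[b + nb * (s + ns * c)];
                A[ant1[b]] += t;   /* a bad baseline counts against both antennas */
                A[ant2[b]] += t;
                B[b] += t;
                C[c] += t;
            }

    /* Normalise so that every measure lies between zero and one. */
    for (size_t a = 0; a < na; a++) A[a] /= (double)(nc * ns * (na - 1));
    for (size_t b = 0; b < nb; b++) B[b] /= (double)(nc * ns);
    for (size_t c = 0; c < nc; c++) C[c] /= (double)(nb * ns);
}
```

In the second pass an antenna, baseline or channel would be flagged completely when its measure exceeds thrs_ant, thrs_base or thrs_chan respectively.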

For the target source, the assumption that the visibility phase varies slowly 
with time need not be satisfied. However, it is often the case that a data subset 
(e.g. a channel or a baseline) that is corrupted during the calibrator scan, is 
also corrupted during the target source scan. Hence the user is allowed to 
interpolate the flags from the calibrator scans on to the target source scans. 
Specifically for the target source scans i, the values of A, B, C are computed 
using 

$$X^i = \max(X^{i-1}, X^{i+1}) \tag{7}$$

where X can be A, B or C. The above interpolation scheme assumes that 
there is a calibrator scan before and after every target source scan, which is 
typical for most GMRT observations. In case the target source scan is not 
bracketed by calibrator scans, the interpolation is done from the nearest cali- 
brator scan. 
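
The interpolation rule can be sketched as follows for one antenna, baseline or channel. The function name `interpolate_badness` and the `is_cal` scan-type array are hypothetical, and the sketch assumes that "previous" and "next" mean the nearest calibrator scans on either side of the target scan.

```c
#include <assert.h>
#include <stddef.h>

/* X[i] holds the badness (A, B or C) of one antenna, baseline or channel for
   scan i, and is_cal[i] != 0 marks calibrator scans. Entries for target scans
   are filled in from the neighbouring calibrator scans. */
static void interpolate_badness(double *X, const int *is_cal, size_t nscan)
{
    for (size_t i = 0; i < nscan; i++) {
        if (is_cal[i])
            continue;
        int have_prev = 0, have_next = 0;
        double prev = 0.0, next = 0.0;
        for (size_t k = i; k-- > 0; )            /* nearest calibrator before scan i */
            if (is_cal[k]) { prev = X[k]; have_prev = 1; break; }
        for (size_t k = i + 1; k < nscan; k++)   /* nearest calibrator after scan i */
            if (is_cal[k]) { next = X[k]; have_next = 1; break; }
        assert(have_prev || have_next);          /* at least one calibrator scan is needed */
        if (have_prev && have_next)
            X[i] = prev > next ? prev : next;    /* bracketed: take the larger value */
        else
            X[i] = have_prev ? prev : next;      /* not bracketed: nearest calibrator */
    }
}
```

Taking the maximum is the conservative choice: a data subset that was bad in either bracketing calibrator scan is treated as bad for the target scan as well.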
Fig. 3 This figure shows the measure of badness for antennas $A$ for different
calibrator scans (shown by different line styles). The value of $A$ is close to
one for antennas which are "bad" and it is close to zero for antennas which are
"good".

Fig. 4 This figure shows the measure of badness for different channels $C$ for
different calibrator scans (shown by different line styles). The value of $C$
is close to one for channels which are "bad" and it is close to zero for
channels which are "good".
2.2 Gain Computation and Calibration

2.2.1 Gain Computation

In situations where baseline based errors can be ignored, the relation between
the observed visibility $\tilde{V}_{ij}$ and the true visibility $V_{ij}$ can
be written as:

$$\tilde{V}_{ij} = g_i\, g_j^{*}\, V_{ij} \tag{8}$$

where $g_i$ is the complex gain (i.e. the overall gain, including instrumental
and ionospheric contributions) of antenna $i$. The goal of a calibration
program is to use the observed visibilities $\tilde{V}_{ij}$ to determine the
gains $g_i$ (which are expected to vary with frequency, Stokes parameter and
time) and hence the true visibilities $V_{ij}$. In "ordinary calibration",
which is the scheme implemented here, it is assumed that the $g_i$ do not vary
much with time. This allows one to compute $g_i$ using the observations of the
calibrators (for which the true visibilities are known) and interpolate these
on to the visibilities of the target source.

As is well known, Eqn. 8 describes an over constrained system since one has
a total of $n_a(n_a - 1)/2$ complex visibilities from which one needs to
determine $n_a$ complex gains. In the flagcal implementation an iterative
method of solving for $g_i$ by least square minimization (see e.g. Bhatnagar
(2001)) is followed. The estimated complex gain $g_i^n$ at iteration $n$ is
given by:



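$$g_i^{n} = g_i^{n-1} + \alpha \left[ \frac{\sum_{j \neq i} X_{ij}\, g_j^{n-1}}{\sum_{j \neq i} |g_j^{n-1}|^2} - g_i^{n-1} \right] \tag{9}$$

where $X_{ij} = \tilde{V}_{ij} / V_{ij}$ is the ratio of the observed to the
true visibility. We use the following convergence criterion for the above
iterative solution:

$$\max \left( \frac{|g_i^{n} - g_i^{n-1}|}{|g_i^{n}|} \right) < \mathtt{eps} \tag{10}$$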

where the maximum is computed over all antennas. As formulated above, the
solutions $g_i$ are independent for each frequency channel, Stokes parameter
and record, and this algorithm can hence be trivially parallelized over any of
them. In the flagcal implementation the calculation is parallelized over
frequency channels. The solution is computed after averaging the data over
time; the length of the solution interval nsol and the maximum number of
iterations niter can be set by the user. The typical solution interval is
2 min and convergence is typically obtained in less than 10 iterations. The
computed gains compare well with those computed independently by tasks in the
AIPS package. The calibration is estimated to be accurate at about the 10%
level. The residual errors are due to a combination of systematic errors (e.g.
any elevation dependent gains are not corrected for) and do not reflect the
precision of the calculations.
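
A damped iterative least squares update of this kind can be sketched as below for one channel, Stokes parameter and solution interval. This is an illustration under stated assumptions and not the flagcal code: the names (`Cplx`, `solve_gains`), the unweighted form of the update, the simultaneous update of all antennas and the convergence test (largest fractional change in any gain) are choices made for the example. Referencing the phases to a reference antenna is not included, so the solution is defined only up to a common phase.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double re, im;
} Cplx;

static Cplx c_mul(Cplx a, Cplx b)
{
    Cplx r = { a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
    return r;
}

static double c_abs2(Cplx a)
{
    return a.re * a.re + a.im * a.im;
}

/* X[i * na + j] holds X_ij (observed / true visibility) for i != j, with
   X_ji = conj(X_ij). g[] holds the starting guess and is updated in place.
   Returns the iteration at which the solution converged, or niter otherwise. */
static int solve_gains(const Cplx *X, Cplx *g, size_t na,
                       double alpha, double eps, int niter)
{
    assert(na > 1 && na <= 64);
    Cplx gnew[64];
    for (int n = 1; n <= niter; n++) {
        double worst = 0.0;   /* largest squared fractional change in this iteration */
        for (size_t i = 0; i < na; i++) {
            Cplx num = { 0.0, 0.0 };
            double den = 0.0;
            for (size_t j = 0; j < na; j++) {
                if (j == i)
                    continue;
                Cplx t = c_mul(X[i * na + j], g[j]);
                num.re += t.re;
                num.im += t.im;
                den += c_abs2(g[j]);
            }
            assert(den > 0.0);
            /* move a fraction alpha of the way towards the least squares estimate */
            gnew[i].re = g[i].re + alpha * (num.re / den - g[i].re);
            gnew[i].im = g[i].im + alpha * (num.im / den - g[i].im);
            Cplx d = { gnew[i].re - g[i].re, gnew[i].im - g[i].im };
            double rel2 = c_abs2(d) / c_abs2(gnew[i]);
            if (rel2 > worst)
                worst = rel2;
        }
        for (size_t i = 0; i < na; i++)
            g[i] = gnew[i];
        if (worst < eps * eps)   /* compare squared quantities to avoid a square root */
            return n;
    }
    return niter;
}
```

Since each channel is solved independently, a routine of this kind could be called from inside a loop over channels that is shared between threads.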
2.3 Post Calibration Flagging

After calibration, the visibilities on the calibrators should have identical
amplitudes and zero phase, within the noise. Data which is slightly corrupted
will hence be easier to detect than before calibration. We hence allow for a
stage of post calibration MAD filtering, in which all visibilities
$V_r(c, s, b)$ for which

$$\mathrm{abs}(|V_r(c, s, b)| - M_1(c, s, r)) > \mathtt{thrs\_qmad} \times M_2(c, s, r) \tag{11}$$

are flagged. Here thrs_qmad is a user defined threshold, and $M_1(c, s, r)$ and
$M_2(c, s, r)$ are the median and MAD computed over the baselines for record
$r$, channel $c$ and Stokes parameter $s$. The intention is that the
calibration step can then be repeated, leading to more robust gain solutions,
although this is not yet implemented in flagcal.



2.3.1 SmoothSubtractThreshold (SST) Flagging 



As discussed in Sec. 2.1.1, the expected visibility for the target source is
not known a priori, and hence the flagging algorithms suggested there may not
be appropriate for target source visibilities. However, even though one does
not know the expected visibilities for the target source, it is reasonable to
assume that the visibilities do not vary very rapidly with either time or
frequency. One could then try and identify corrupted data by looking for
deviant points either after polynomial fitting to the visibilities, or after
smoothing the data (see e.g. Chengalur (1996); Kanekar & Chengalur (1997);
Urvashi (2003); Offringa et al. (2010)). In flagcal the latter approach is
followed. The data



is smoothed over one dimension (either time or frequency) and then visibilities
that differ from the smoothed value by more than a threshold amount are
flagged. This can be repeated for nriter_X iterations (where X can be time t
or frequency f) with the smoothing size increasing with each iteration.
Specifically, for a set of $n$ visibilities $x_i$ one first computes the
smoothed visibilities over a window of length $2w_j + 1$, viz.

$$\bar{x}_i^j = \frac{1}{2w_j + 1} \times \begin{cases} \sum_{k=0}^{2w_j} x_{i+k} & \text{for } 0 \le i < w_j \\ \sum_{k=-w_j}^{w_j} x_{i+k} & \text{for } w_j \le i < n - w_j \\ \sum_{k=-2w_j}^{0} x_{i+k} & \text{for } n - w_j \le i < n \end{cases} \tag{12}$$

then subtracts the smoothed data from the original

$$\Delta_i^j = x_i - \bar{x}_i^j \tag{13}$$

Data points for which

$$\mathrm{abs}(\Delta_i^j - M_1) > \mathtt{thrs\_peak\_X}^j \times M_2 \tag{14}$$

are flagged, where $M_1$ and $M_2$ are the median and median absolute deviation
of the series $\Delta_i^j$. The user can choose a range of window sizes $w_j$
and thresholds $\mathtt{thrs\_peak\_X}^j$, and the number of iterations
nriter_X (see Table 2, with X replaced by t for time and f for frequency),
over which this process can be repeated. As before this operation is
parallelized using OpenMP pragmas.

Module name          1 thread   2 threads   4 threads   8 threads
Global Mad filter      205.68      108.08       66.93       41.45
Pre-Mad filter          48.49       25.66       15.23        9.17
Gain computation        81.56       53.58       31.51       23.45
Peak in time           218.94      118.76       59.06       28.64
Peak in frequency      202.61      101.94       55.39       27.89
Post Mad filter         45.79       27.84       20.01       16.88
Total time             803.07      435.86      248.15      147.48

Table 1 This table shows the time taken in seconds by the various modules for
1, 2, 4 and 8 threads (columns 2, 3, 4 and 5 respectively). The last row shows
the total time taken (without considering the time taken by read/write and
other serial sections). For timing we used a visibility data set with 128
frequency channels, 2 Stokes parameters, 435 baselines and 1506 time samples
(16 second sampling). The file size is ~2 GB and it takes ~73 s to read in the
file from disk.

The width of the smoothing window $w_j$ and the threshold
$\mathtt{thrs\_peak\_X}^j$ at iteration $j$ are given by:

$$w_j = (\mathtt{dwin\_X})^j \times \mathtt{win\_min\_X} \tag{15}$$

and

$$\mathtt{thrs\_peak\_X}^j = (\mathtt{dpX})^j \times \mathtt{thrs\_peak\_X} \tag{16}$$

where descriptions of win_min_X, dwin_X, dpX and thrs_peak_X are given in
Table 2 (with X replaced by t or f), together with their typical values.
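
The smooth, subtract and threshold loop can be sketched for a single one dimensional series as below. This is an illustration under stated assumptions rather than the flagcal implementation: the function names (`median_inplace`, `sst_flag`) are hypothetical, the window is taken to be one-sided at the two ends of the series, points that are already flagged are left out of the running mean and of the statistics, and the iteration stops early once the window no longer fits the series.

```c
#include <assert.h>
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of the first n entries of buf; sorts buf in place. */
static double median_inplace(double *buf, size_t n)
{
    assert(n > 0);
    qsort(buf, n, sizeof *buf, cmp_double);
    return (n % 2) ? buf[n / 2] : 0.5 * (buf[n / 2 - 1] + buf[n / 2]);
}

/* Flags outliers in x[0..n-1] by setting flag[i] = 1; flag[] must be
   initialised by the caller. Returns the total number of flagged points. */
static size_t sst_flag(const double *x, int *flag, size_t n, int nriter,
                       double thrs_peak, double dp, size_t win_min, double dwin)
{
    assert(n > 0 && win_min >= 1);
    double *res = malloc(n * sizeof *res);
    double *buf = malloc(n * sizeof *buf);
    assert(res != NULL && buf != NULL);
    double thrs = thrs_peak, win = (double)win_min;

    /* window grows by dwin and threshold shrinks by dp at every iteration */
    for (int it = 0; it < nriter; it++, thrs *= dp, win *= dwin) {
        size_t w = (size_t)win, m = 0;
        if (3 * w > n)
            break;                                   /* window too long for this series */
        for (size_t i = 0; i < n; i++) {
            /* one-sided window at the ends, centred window elsewhere */
            size_t lo = i < w ? i : (i + w >= n ? i - 2 * w : i - w);
            double sum = 0.0;
            size_t cnt = 0;
            for (size_t k = lo; k <= lo + 2 * w; k++)
                if (!flag[k]) { sum += x[k]; cnt++; }
            res[i] = cnt ? x[i] - sum / (double)cnt : 0.0;   /* residual after smoothing */
            if (!flag[i])
                buf[m++] = res[i];
        }
        if (m == 0)
            break;
        double m1 = median_inplace(buf, m);
        for (size_t k = 0; k < m; k++)
            buf[k] = buf[k] > m1 ? buf[k] - m1 : m1 - buf[k];
        double m2 = median_inplace(buf, m);
        for (size_t i = 0; i < n; i++) {
            double d = res[i] > m1 ? res[i] - m1 : m1 - res[i];
            if (!flag[i] && d > thrs * m2)
                flag[i] = 1;
        }
    }
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (size_t)(flag[i] != 0);
    free(res);
    free(buf);
    return total;
}
```

Because the smoothing is a running mean, a very strong spike also pulls up the smoothed value of its immediate neighbours, so a few points adjacent to the spike may be flagged along with it.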

We note that running this procedure on the target source could result in 
genuine emission from very strong transients being filtered out along with 
undesired RFI. In such situations, it would be better to disable this flagging. 
In the more general case, where one is searching for transients that are too 
faint to show up in the visibility of a single channel of a single baseline, this 
algorithm could be used. 



3 Results 

The flagcal program has been tested on several GMRT data sets. The program
works well at removing RFI and calibrating for all the GMRT bands with minimal
tuning of user input parameters. The default values of the parameters have
been set after running the program on various data sets and examining what
data was flagged. The fact that many of the parameters are expressed in terms
of the underlying robust statistics of the data also reduces the necessity to
fine tune them for each data set. In Fig. 5 is shown a time-frequency plot of
visibilities on a given baseline taken from a GMRT 150 MHz observation. This
data set has been selected for illustration because at the GMRT the 150 MHz
band is generally the worst affected by interference. Panel [A] shows the raw
visibilities, while Panel [B] shows the visibilities after being processed
through flagcal. As can be seen, the strong RFI visible in Panel [A] has been
removed and the calibration of the data also reduces the variation in the
intensity. About 30% of the data has been flagged out; note that this also
includes data from antennas that had problems with their servo systems etc.

An image made with this processed data is shown in Fig. 6. The image was made
using tasks in the AIPS package. No further calibration or flagging was done
in AIPS. Instead the data was imaged and cleaned using the task IMAGR. The
UVRANGE was set to 0-10 kλ and the UVTAPER to 8 kλ, resulting in a resolution
of 30'' x 27''. Given that the pipeline is intended only for a quick look at
the data we do not do more detailed cross checks. We postpone these for a
later stage as the pipeline becomes more mature.

Fig. 5 [A] The top panel shows a time-frequency gray plot for a baseline for a
typical GMRT observation at 157 MHz. The vertical stripes at constant intervals
in the figure are the scans for the phase calibrator. The presence of a large
number of structures shows that the baseline is heavily corrupted due to RFI.
[B] The bottom panel shows the same data after processing through flagcal.
Note that the RFI has been flagged out, and the intensity scale also has
become narrower. The wedge on the right side shows the intensity scale in Jy.

Fig. 6 A 157 MHz image obtained after processing the data with flagcal. A
snapshot of the processed visibilities from which this image has been made is
shown in Fig. 5. Note that the only calibration applied is that computed by
flagcal. The image was produced using the AIPS task IMAGR and has been
CLEANed, but not self calibrated.
Table 2 lists the parameters available to the user in tuning the performance
of the flagcal program. The execution times for the various modules on an
AMD Opteron (2.6 GHz, 8 GB RAM) with dual processors, each with 4 cores,
are given in Table 1.

In summary, we present the initial results from flagcal, a pipeline aimed
at doing initial flagging and calibration of GMRT data. The aim of this version
of the pipeline was to pre process the visibility data sufficiently in order to
allow the user to make a quick look image of the data. We demonstrate this
using a GMRT 150 MHz data set, a frequency band where RFI problems are
generally severe.
Modules and their parameters

Module name                 Parameter (typical value)   Description of parameter (with reference)
Gain solution               nsol (4)                    number of records used for gain solution (see Sec. 2.2.1)
                            niter (50)                  number of iterations for gain solution (see Sec. 2.2.1)
                            alpha (0.1)                 alpha coefficient (see Eqn 9)
                            eps (0.01)                  used for checking convergence (see Eqn 10)
                            refant (1)                  reference antenna (which has phase zero)
Clip                        thrs_clip (1000.0)          threshold used for clipping
Global mad filter           thrs_gmad (9)               threshold used for filtering (see Eqn 1)
Pre mad filter              thrs_pmad (9)               threshold used for filtering (see Eqn 2)
VSR/ABC flagging            nflag (6)                   number of samples used for VSR (see Sec. 2.1.1, para 1)
                            thrs_vsr (0.95)             threshold for VSR (see Sec. 2.1.1)
                            thrs_ngood (0.7)            threshold for bad block (see point 1 in Sec. 2.1.1)
                            thrs_block (0.5)            threshold for bad block (see point 2 in Sec. 2.1.1)
                            thrs_ant (0.3)              threshold for bad antennas (see Sec. 2.1.1)
                            thrs_base (0.3)             threshold for bad baselines (see Sec. 2.1.1)
                            thrs_chan (0.3)             threshold for bad channels (see Sec. 2.1.1)
Post mad filter             thrs_qmad (9)               threshold for post mad filtering (see Eqn 11)
SST flagging in time        nriter_t (4)                number of SST iterations (see Sec. 2.3.1)
                            thrs_peak_t (9.0)           starting threshold for SST (see Eqn 16)
                            dpt (0.7)                   factor for decreasing SST threshold (see Eqn 16)
                            win_min_t (2)               starting smoothing window for SST (see Eqn 15)
                            dwin_t (2)                  factor for increasing SST window (see Eqn 15)
SST flagging in frequency   nriter_f (4)                number of SST iterations (see Sec. 2.3.1)
                            thrs_peak_f (9.0)           starting threshold for SST (see Eqn 16)
                            dpf (0.7)                   factor for decreasing SST threshold (see Eqn 16)
                            win_min_f (2)               starting smoothing window for SST (see Eqn 15)
                            dwin_f (2)                  factor for increasing SST window (see Eqn 15)

Table 2 The Flagging and Calibration pipeline FLAGCAL consists of modules which
are used for calibration and flagging. Every module has its set of parameters
and the values of the parameters control the effectiveness and accuracy of the
module. In most cases the default (typical) values of the parameters can be
used; however, in some cases tuning of the parameters may be required. In the
above table the first column gives the names of the modules, and the second and
the third columns list the parameters and give their description respectively.
We also give the typical values of the parameters in the second column (in
brackets) and the reference in the text where these parameters are discussed.
The names of the parameters in the above table are identical to those used in
the code. Note that the typical length of a GMRT record is 16 seconds.



